[DAGCombine] Fold vselect with splat zero #147305

badumbatish · 2025-07-07T13:56:43Z

Fold bitselect with splat <0> in either first or second
argument.
Fixes WASM: v128.bitselect When Argument is Zeroed Can Be Simplified #73454

llvmbot · 2025-07-07T13:57:18Z

@llvm/pr-subscribers-llvm-selectiondag
@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-backend-webassembly

Author: jjasmine (badumbatish)

Changes

[WASM] Fold bitselect with argument <0>

Fixes #73454
Fold bitselect with the splat <0> in either first or second
argument.

For first argument <0>: vselect <0>, X, Y -> Y
For second argument <0>: vselect Y, <0>, X -> AND(<X>, !<Y>)
For third argument <0>: vselect X, Y, <0> -> AND(<Y>, <X>)

Detailed explanation in the implementation.

Full diff: https://github.com/llvm/llvm-project/pull/147305.diff

2 Files Affected:

(modified) llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp (+63)
(added) llvm/test/CodeGen/WebAssembly/simd-bitselect.ll (+48)

diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
index bf2e04caa0a61..8009bfcc861c3 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
@@ -191,6 +191,9 @@ WebAssemblyTargetLowering::WebAssemblyTargetLowering(
     // Combine vector mask reductions into alltrue/anytrue
     setTargetDAGCombine(ISD::SETCC);
 
+    // Convert vselect of various zero arguments to AND
+    setTargetDAGCombine(ISD::VSELECT);
+
     // Convert vector to integer bitcasts to bitmask
     setTargetDAGCombine(ISD::BITCAST);
 
@@ -3210,6 +3213,64 @@ static SDValue performTruncateCombine(SDNode *N,
   return truncateVectorWithNARROW(OutVT, In, DL, DAG);
 }
 
+static SDValue performVSelectCombine(SDNode *N, SelectionDAG &DAG) {
+  // In the tablegen.td, vselect A B C -> bitselect B C A
+
+  // SCENARIO A
+  // vselect <0>, X, Y
+  // -> bitselect X, Y, <0>
+  // -> or (AND(X, <0>), AND(<Y>, !<0>))
+  // -> or (0, AND(<Y>, !<0>))
+  // -> AND(Y, !<0>)
+  // -> AND(Y, 1)
+  // -> Y
+
+  // SCENARIO B
+  // vselect Y, <0>, X
+  // -> bitselect <0>, X, Y
+  // -> or (AND(<0>, Y), AND(<X>, !<Y>))
+  // -> or (0, AND(<X>, !<Y>))
+  // -> AND(<X>, !<Y>)
+
+  // SCENARIO C
+  // vselect X, Y, <0>
+  // -> bitselect Y, <0>, X
+  // -> or (AND(Y, X), AND(<0>, !X))
+  // -> or (AND(Y, X), <0>)
+  // -> AND(Y, X)
+
+  using namespace llvm::SDPatternMatch;
+  assert(N->getOpcode() == ISD::VSELECT);
+
+  SDLoc DL(N);
+
+  SDValue Cond = N->getOperand(0), LHS = N->getOperand(1),
+          RHS = N->getOperand(2);
+  EVT NVT = N->getValueType(0);
+
+  APInt SplatValue;
+
+  // SCENARIO A
+  if (ISD::isConstantSplatVector(Cond.getNode(), SplatValue) &&
+      SplatValue.isZero())
+    return RHS;
+
+  // SCENARIO B
+  if (ISD::isConstantSplatVector(LHS.getNode(), SplatValue) &&
+      SplatValue.isZero())
+    return DAG.getNode(
+        ISD::AND, DL, NVT,
+        {RHS, DAG.getSExtOrTrunc(DAG.getNOT(DL, Cond, Cond.getValueType()), DL,
+                                 NVT)});
+
+  // SCENARIO C
+  if (ISD::isConstantSplatVector(RHS.getNode(), SplatValue) &&
+      SplatValue.isZero())
+    return DAG.getNode(ISD::AND, DL, NVT,
+                       {LHS, DAG.getSExtOrTrunc(Cond, DL, NVT)});
+  return SDValue();
+}
+
 static SDValue performBitcastCombine(SDNode *N,
                                      TargetLowering::DAGCombinerInfo &DCI) {
   using namespace llvm::SDPatternMatch;
@@ -3505,6 +3566,8 @@ WebAssemblyTargetLowering::PerformDAGCombine(SDNode *N,
   switch (N->getOpcode()) {
   default:
     return SDValue();
+  case ISD::VSELECT:
+    return performVSelectCombine(N, DCI.DAG);
   case ISD::BITCAST:
     return performBitcastCombine(N, DCI);
   case ISD::SETCC:
diff --git a/llvm/test/CodeGen/WebAssembly/simd-bitselect.ll b/llvm/test/CodeGen/WebAssembly/simd-bitselect.ll
new file mode 100644
index 0000000000000..7803709dc8b4a
--- /dev/null
+++ b/llvm/test/CodeGen/WebAssembly/simd-bitselect.ll
@@ -0,0 +1,48 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc < %s  -O3 -verify-machineinstrs -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -mattr=+simd128 | FileCheck %s
+target triple = "wasm32-unknown-unknown"
+
+define void @bitselect_first_zero(ptr %output, ptr  %input) {
+; CHECK-LABEL: bitselect_first_zero:
+; CHECK:         .functype bitselect_first_zero (i32, i32) -> ()
+; CHECK-NEXT:  # %bb.0: # %start
+; CHECK-NEXT:    v128.load $push6=, 0($1)
+; CHECK-NEXT:    local.tee $push5=, $2=, $pop6
+; CHECK-NEXT:    v128.const $push0=, 2139095040, 2139095040, 2139095040, 2139095040
+; CHECK-NEXT:    v128.and $push1=, $2, $pop0
+; CHECK-NEXT:    v128.const $push2=, 0, 0, 0, 0
+; CHECK-NEXT:    i32x4.ne $push3=, $pop1, $pop2
+; CHECK-NEXT:    v128.and $push4=, $pop5, $pop3
+; CHECK-NEXT:    v128.store 0($0), $pop4
+; CHECK-NEXT:    return
+start:
+  %input.val = load <4 x i32>, ptr %input, align 16
+  %0 = and <4 x i32> %input.val, splat (i32 2139095040)
+  %1 = icmp eq <4 x i32> %0, zeroinitializer
+  %2 = select <4 x i1> %1, <4 x i32> zeroinitializer, <4 x i32> %input.val
+  store <4 x i32> %2, ptr %output, align 16
+  ret void
+}
+
+
+define void @bitselect_second_zero(ptr %output, ptr %input) {
+; CHECK-LABEL: bitselect_second_zero:
+; CHECK:         .functype bitselect_second_zero (i32, i32) -> ()
+; CHECK-NEXT:  # %bb.0: # %start
+; CHECK-NEXT:    v128.load $push6=, 0($1)
+; CHECK-NEXT:    local.tee $push5=, $2=, $pop6
+; CHECK-NEXT:    v128.const $push0=, 2139095040, 2139095040, 2139095040, 2139095040
+; CHECK-NEXT:    v128.and $push1=, $2, $pop0
+; CHECK-NEXT:    v128.const $push2=, 0, 0, 0, 0
+; CHECK-NEXT:    i32x4.eq $push3=, $pop1, $pop2
+; CHECK-NEXT:    v128.and $push4=, $pop5, $pop3
+; CHECK-NEXT:    v128.store 0($0), $pop4
+; CHECK-NEXT:    return
+start:
+  %input.val = load <4 x i32>, ptr %input, align 16
+  %0 = and <4 x i32> %input.val, splat (i32 2139095040)
+  %1 = icmp eq <4 x i32> %0, zeroinitializer
+  %2 = select  <4 x i1> %1, <4 x i32> %input.val, <4 x i32> zeroinitializer
+  store <4 x i32> %2, ptr %output, align 16
+  ret void
+}

badumbatish · 2025-07-07T13:57:37Z

waiting for CI to pass before tagging

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp

llvm/test/CodeGen/WebAssembly/simd-bitselect.ll

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp

badumbatish · 2025-07-08T23:24:07Z

sorry about the conflict, I didn't catch that locally, will repush

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

lukel97 · 2025-07-09T01:41:56Z

The title of the PR and description probably need to be changed since this isn't really a wasm PR anymore, it's more to do with DAGCombiner

RKSimon

please can you rebase after #147472 - I think what you're after is just the undef element handling that isConstantSplatVector gives you by default

RKSimon

Also, I think wasm is missing an hasAndNot override

lukel97 · 2025-07-09T10:12:29Z

please can you rebase after #147472 - I think what you're after is just the undef element handling that isConstantSplatVector gives you by default

I think it's the fact that build_vectors seem to be canonicalzied to splat_vectors, and combineVSelectWithAllOnesOrZeroes only seems to pick up build_vectors because it calls these:

bool ISD::isBuildVectorAllOnes(const SDNode *N) {
  return isConstantSplatVectorAllOnes(N, /*BuildVectorOnly*/ true);
}

bool ISD::isBuildVectorAllZeros(const SDNode *N) {
  return isConstantSplatVectorAllZeros(N, /*BuildVectorOnly*/ true);
}

Also, I think wasm is missing an hasAndNot override

Yes, that's exactly what we're looking for the second part of this PR on lines R13161. @badumbatish although wasm doesn't seem to have an andnot instruction, it might be worthwhile exploring if returning true is profitable for vectors anyway.

badumbatish · 2025-07-09T17:16:13Z

Ah this is unfortunate, I didn't realize the other PR was opened at the time as well. Will probably rebase edit before the mass change in tests

badumbatish · 2025-07-09T19:54:32Z

although wasm doesn't seem to have an andnot instruction, it might be worthwhile exploring if returning true is profitable for vectors anyway.

I think the current impl checks if the scalar bit size is equal before folding to avoid having extra sign extending. I'm not sure if this is enough.

Precommit test for isConstantSplatVectorAll in VSelect

- Use isConstantSplatVectorAll* in VSelect - Update tests to reflect this

lukel97

LGTM, thanks! Just some comments on the test

lukel97 · 2025-07-10T06:38:23Z

llvm/test/CodeGen/WebAssembly/simd-bitselect.ll

+; RUN: llc < %s  -O3 -verify-machineinstrs -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -mattr=+simd128 | FileCheck %s
+target triple = "wasm32-unknown-unknown"
+
+define <4 x i32> @bitselect_splat_first_zero_and_icmp(<4 x i32>  %input) {


Can we move these tests into simd-select.ll to keep it with the other select tests? And reuse the same variable names in that test e.g. use %x for the input, %c for the icmp and %res for the select

lukel97 · 2025-07-10T06:38:40Z

llvm/test/CodeGen/WebAssembly/simd-bitselect.ll

+; CHECK-NEXT:    i32x4.ne $push3=, $pop1, $pop2
+; CHECK-NEXT:    v128.and $push4=, $pop3, $0
+; CHECK-NEXT:    return $pop4
+start:


Nit, you can leave out the basic block labels if they're not used

Suggested change

start:

lukel97 · 2025-07-10T06:38:59Z

llvm/test/CodeGen/WebAssembly/simd-bitselect.ll

+; CHECK-NEXT:    v128.and $push3=, $pop2, $1
+; CHECK-NEXT:    return $pop3
+start:
+  %2 = select  <4 x i1> %cond, <4 x i32> %input, <4 x i32> zeroinitializer


Nit, some stray spaces

Suggested change

%2 = select <4 x i1> %cond, <4 x i32> %input, <4 x i32> zeroinitializer

%2 = select <4 x i1> %cond, <4 x i32> %input, <4 x i32> zeroinitializer

lukel97 · 2025-07-10T06:39:49Z

llvm/test/CodeGen/WebAssembly/simd-bitselect.ll

+  ret <4 x i32> %2
+}
+
+define <4 x i32> @bitselect_splat_second_zero_cond_input(<4 x i1> %cond, <4 x i32>  %input) {


Suggested change

define <4 x i32> @bitselect_splat_second_zero_cond_input(<4 x i1> %cond, <4 x i32> %input) {

define <4 x i32> @bitselect_splat_second_zero_cond_input(<4 x i1> %cond, <4 x i32> %input) {

RKSimon · 2025-07-10T06:56:47Z

Please can you update the summary

llvmbot added the backend:WebAssembly label Jul 7, 2025

lukel97 reviewed Jul 8, 2025

View reviewed changes

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp Outdated Show resolved Hide resolved

lukel97 reviewed Jul 8, 2025

View reviewed changes

llvmbot added backend:AArch64 backend:X86 llvm:SelectionDAG SelectionDAGISel as well labels Jul 8, 2025

lukel97 reviewed Jul 9, 2025

View reviewed changes

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Outdated Show resolved Hide resolved

RKSimon requested changes Jul 9, 2025

View reviewed changes

RKSimon reviewed Jul 9, 2025

View reviewed changes

badumbatish force-pushed the bitselect_v128_zero_simpl branch from 3dc890b to b3e58ea Compare July 9, 2025 19:48

badumbatish changed the title ~~[WASM] Fold bitselect with splat zero~~ [DAGCombine] Fold vselect with splat zero Jul 9, 2025

badumbatish added 2 commits July 9, 2025 20:45

[DAGCombine] Precommit test for isConstantSplatVectorAll

a9d15d1

Precommit test for isConstantSplatVectorAll in VSelect

[DAGCombine] Use isConstantSplatVectorAll in VSelect

1ad59c0

- Use isConstantSplatVectorAll* in VSelect - Update tests to reflect this

badumbatish force-pushed the bitselect_v128_zero_simpl branch from b3e58ea to 1ad59c0 Compare July 10, 2025 03:49

lukel97 approved these changes Jul 10, 2025

View reviewed changes

lukel97 requested review from RKSimon and rj-jesus July 10, 2025 06:40

	%2 = select <4 x i1> %cond, <4 x i32> %input, <4 x i32> zeroinitializer
	%2 = select <4 x i1> %cond, <4 x i32> %input, <4 x i32> zeroinitializer

	define <4 x i32> @bitselect_splat_second_zero_cond_input(<4 x i1> %cond, <4 x i32> %input) {
	define <4 x i32> @bitselect_splat_second_zero_cond_input(<4 x i1> %cond, <4 x i32> %input) {

[DAGCombine] Fold vselect with splat zero #147305

Are you sure you want to change the base?

[DAGCombine] Fold vselect with splat zero #147305

Conversation

badumbatish commented Jul 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Jul 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

badumbatish commented Jul 7, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

badumbatish commented Jul 8, 2025

Uh oh!

Uh oh!

lukel97 commented Jul 9, 2025

Uh oh!

RKSimon left a comment

Choose a reason for hiding this comment

Uh oh!

RKSimon left a comment

Choose a reason for hiding this comment

Uh oh!

lukel97 commented Jul 9, 2025

Uh oh!

badumbatish commented Jul 9, 2025

Uh oh!

badumbatish commented Jul 9, 2025

Uh oh!

lukel97 left a comment

Choose a reason for hiding this comment

Uh oh!

lukel97 Jul 10, 2025

Choose a reason for hiding this comment

Uh oh!

lukel97 Jul 10, 2025

Choose a reason for hiding this comment

Uh oh!

lukel97 Jul 10, 2025

Choose a reason for hiding this comment

Uh oh!

lukel97 Jul 10, 2025

Choose a reason for hiding this comment

Uh oh!

RKSimon commented Jul 10, 2025

Uh oh!

Uh oh!

badumbatish commented Jul 7, 2025 •

edited

Loading

llvmbot commented Jul 7, 2025 •

edited

Loading